    Privacy Against Statistical Inference

    We propose a general statistical inference framework to capture the privacy threat incurred by a user who releases data to a passive but curious adversary, given utility constraints. We show that applying this general framework to the setting where the adversary uses the self-information cost function naturally leads to a non-asymptotic information-theoretic approach for characterizing the best achievable privacy subject to utility constraints. Based on these results we introduce two privacy metrics, namely average information leakage and maximum information leakage. We prove that under both metrics the resulting design problem of finding the optimal mapping from the user's data to a privacy-preserving output can be cast as a modified rate-distortion problem which, in turn, can be formulated as a convex program. Finally, we compare our framework with differential privacy. Comment: Allerton 2012, 8 pages.
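
    The convex formulation makes the optimal privacy mapping directly computable. Below is a minimal sketch (my own illustration, not the authors' code) of such a rate-distortion-style program: it minimizes the information leakage $I(X;Y)$ over the mapping $p(y|x)$ subject to an expected-distortion (utility) constraint. The alphabet sizes, prior, Hamming distortion, and the use of cvxpy are all assumptions for the example.

```python
# Sketch: leakage minimization as a convex (modified rate-distortion) program.
# All problem data below are illustrative assumptions, not from the paper.
import numpy as np
import cvxpy as cp

n, m = 4, 4                        # alphabet sizes for data X and release Y (assumed)
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(n))      # prior p(x), assumed known
d = 1.0 - np.eye(n, m)             # Hamming distortion d(x, y) (assumed)
D = 0.3                            # utility constraint: E[d(X, Y)] <= D

Q = cp.Variable((n, m), nonneg=True)        # privacy mapping p(y|x)
P = np.tile(p[:, None], (1, m))             # p(x) broadcast to an n x m array
J = cp.multiply(P, Q)                       # joint p(x, y), affine in Q
q = p @ Q                                   # output marginal q(y)
M = p[:, None] @ cp.reshape(q, (1, m))      # product marginal p(x)q(y), affine in Q

# I(X;Y) = KL(joint || product of marginals); rel_entr keeps this DCP-convex.
leakage = cp.sum(cp.rel_entr(J, M))
constraints = [cp.sum(Q, axis=1) == 1,                  # rows of p(y|x) are pmfs
               cp.sum(cp.multiply(P * d, Q)) <= D]      # expected distortion <= D
prob = cp.Problem(cp.Minimize(leakage), constraints)
prob.solve()
print(f"minimum leakage I(X;Y) = {prob.value / np.log(2):.4f} bits")
```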

    An Exploration of the Role of Principal Inertia Components in Information Theory

    The principal inertia components of the joint distribution of two random variables $X$ and $Y$ are inherently connected to how an observation of $Y$ is statistically related to a hidden variable $X$. In this paper, we explore this connection within an information-theoretic framework. We show that, under certain symmetry conditions, the principal inertia components play an important role in estimating one-bit functions of $X$, namely $f(X)$, given an observation of $Y$. In particular, the principal inertia components bear an interpretation as filter coefficients in the linear transformation of $p_{f(X)|X}$ into $p_{f(X)|Y}$. This interpretation naturally leads to the conjecture that the mutual information between $f(X)$ and $Y$ is maximized when all the principal inertia components have equal value. We also study the role of the principal inertia components in the Markov chain $B \rightarrow X \rightarrow Y \rightarrow \widehat{B}$, where $B$ and $\widehat{B}$ are binary random variables. We illustrate our results for the setting where $X$ and $Y$ are binary strings and $Y$ is the result of sending $X$ through an additive-noise binary channel. Comment: Submitted to the 2014 IEEE Information Theory Workshop (ITW).
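
    As a concrete aside (my own illustration, not taken from the paper): the principal inertia components of a joint pmf $p(x,y)$ are the squared singular values, other than the trivial largest one (which always equals 1), of the matrix with entries $p(x,y)/\sqrt{p(x)p(y)}$. A short numpy sketch with a random joint pmf as a stand-in:

```python
# Sketch: principal inertia components via the SVD of the divergence
# transition matrix Q with entries p(x, y) / sqrt(p(x) p(y)).
import numpy as np

rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(6)).reshape(2, 3)   # illustrative joint pmf p(x, y)
px, py = P.sum(axis=1), P.sum(axis=0)         # marginals p(x), p(y)

Q = P / np.sqrt(np.outer(px, py))             # divergence transition matrix
sigma = np.linalg.svd(Q, compute_uv=False)    # singular values, descending

# sigma[0] == 1 is the trivial component (singular vectors sqrt(px), sqrt(py));
# the remaining squared singular values are the principal inertia components.
pics = sigma[1:] ** 2
print("principal inertia components:", pics)
```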

    Bottleneck Problems: Information and Estimation-Theoretic View

    Information bottleneck (IB) and privacy funnel (PF) are two closely related optimization problems that have found applications in machine learning, the design of privacy algorithms, capacity problems (e.g., Mrs. Gerber's Lemma), and strong data processing inequalities, among others. In this work, we first investigate the functional properties of IB and PF through a unified theoretical framework. We then connect them to three information-theoretic coding problems, namely hypothesis testing against independence, noisy source coding, and dependence dilution. Leveraging these connections, we prove a new cardinality bound for the auxiliary variable in IB, making its computation more tractable for discrete random variables. In the second part, we introduce a general family of optimization problems, termed \textit{bottleneck problems}, by replacing the mutual information in IB and PF with other notions of mutual information, namely $f$-information and Arimoto's mutual information. We then argue that, unlike IB and PF, these problems lead to easily interpretable guarantees in a variety of inference tasks with statistical constraints on accuracy and privacy. Although the underlying optimization problems are non-convex, we develop a technique to evaluate bottleneck problems in closed form by equivalently expressing them in terms of the lower convex or upper concave envelopes of certain functions. By applying this technique to the binary case, we derive closed-form expressions for several bottleneck problems.
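
    While the paper evaluates bottleneck problems in closed form via convex/concave envelopes, the standard way to compute the ordinary IB numerically is the iterative self-consistent-equation algorithm of Tishby et al. The sketch below (not the paper's envelope technique; all problem data are assumed) runs those iterations for discrete variables:

```python
# Sketch: classical iterative information bottleneck for discrete X, Y.
# Alternates the self-consistent equations for p(t|x), p(t), and p(y|t).
import numpy as np

rng = np.random.default_rng(2)
nx, ny, nt = 4, 3, 4                                   # alphabet sizes (assumed)
Pxy = rng.dirichlet(np.ones(nx * ny)).reshape(nx, ny)  # illustrative joint p(x, y)
px, py = Pxy.sum(axis=1), Pxy.sum(axis=0)              # marginals p(x), p(y)
Py_x = Pxy / px[:, None]                               # p(y|x)
beta = 5.0                                             # trade-off parameter (assumed)

Pt_x = rng.dirichlet(np.ones(nt), size=nx)             # random init of p(t|x)
for _ in range(200):
    pt = px @ Pt_x                                     # p(t)
    Py_t = (Pt_x * px[:, None]).T @ Py_x / pt[:, None] # p(y|t) by marginalization
    # D_KL(p(y|x) || p(y|t)) for every (x, t) pair
    kl = np.sum(Py_x[:, None, :]
                * (np.log(Py_x[:, None, :]) - np.log(Py_t[None, :, :])), axis=2)
    Pt_x = pt[None, :] * np.exp(-beta * kl)            # Boltzmann-style update
    Pt_x /= Pt_x.sum(axis=1, keepdims=True)            # renormalize rows to pmfs

# Report the two sides of the bottleneck trade-off, in bits.
Pxt = Pt_x * px[:, None]
Ixt = np.sum(Pxt * np.log(Pxt / np.outer(px, pt))) / np.log(2)
Iyt = np.sum(pt[:, None] * Py_t * np.log(Py_t / py[None, :])) / np.log(2)
print(f"I(X;T) = {Ixt:.3f} bits, I(Y;T) = {Iyt:.3f} bits")
```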